Online Product Quantization

Authors

  • Donna Xu
  • Ivor W. Tsang
  • Ying Zhang
Abstract

Approximate nearest neighbor (ANN) search has achieved great success in many tasks. However, existing popular methods for ANN search, such as hashing and quantization methods, are designed for static databases only. They cannot handle databases whose data distribution evolves dynamically, because retraining the model on the new database is computationally expensive. In this paper, we address the problem by developing an online product quantization (online PQ) model that incrementally updates the quantization codebook to accommodate the incoming streaming data. Moreover, to further reduce the large-scale computation of the online PQ update, we design two budget constraints under which the model updates only part of the PQ codebook instead of all of it. We derive a loss bound that guarantees the performance of our online PQ model. Furthermore, we develop an online PQ model over a sliding window that supports both data insertion and deletion, to reflect the real-time behaviour of the data. The experiments demonstrate that our online PQ model is both time-efficient and effective for ANN search in dynamic large-scale databases compared with baseline methods, and that partial PQ codebook updates further reduce the update cost.
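The incremental codebook update described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a running-mean (sequential k-means) update of the nearest sub-codeword in each subspace, and a single hypothetical `budget` parameter that limits the update to the subspaces with the largest quantization error. The class name `OnlinePQ`, its methods, and the choice to count each initial codeword as one observation are all assumptions made for illustration.

```python
from typing import List, Optional

Vector = List[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class OnlinePQ:
    """Illustrative online product quantizer (hypothetical, not the paper's code)."""

    def __init__(self, codebooks: List[List[Vector]]):
        # codebooks[m][k] is sub-codeword k of subspace m.
        self.codebooks = [[list(c) for c in book] for book in codebooks]
        # Assumption: each initial sub-codeword counts as one observed point.
        self.counts = [[1] * len(book) for book in codebooks]
        self.sub_dim = len(codebooks[0][0])

    def split(self, x: Vector) -> List[Vector]:
        """Cut x into one contiguous sub-vector per subspace."""
        d = self.sub_dim
        return [x[m * d:(m + 1) * d] for m in range(len(self.codebooks))]

    def encode(self, x: Vector) -> List[int]:
        """Index of the nearest sub-codeword in every subspace."""
        return [
            min(range(len(book)), key=lambda k: sq_dist(sub, book[k]))
            for sub, book in zip(self.split(x), self.codebooks)
        ]

    def update(self, x: Vector, budget: Optional[int] = None) -> List[int]:
        """Move the assigned sub-codewords toward x with a running mean.

        If budget is given, only the `budget` subspaces with the largest
        quantization error are updated (an assumed selection rule).
        """
        subs = self.split(x)
        code = self.encode(x)
        errors = [sq_dist(subs[m], self.codebooks[m][k]) for m, k in enumerate(code)]
        order = sorted(range(len(subs)), key=lambda m: errors[m], reverse=True)
        chosen = order if budget is None else order[:budget]
        for m in chosen:
            k = code[m]
            self.counts[m][k] += 1
            n = self.counts[m][k]
            codeword = self.codebooks[m][k]
            for j, value in enumerate(subs[m]):
                codeword[j] += (value - codeword[j]) / n
        return code
```

A call such as `OnlinePQ(codebooks).update(x, budget=1)` touches only one sub-codeword per incoming point, which is the kind of partial update the budget constraints are meant to enable; the sliding-window deletion mentioned in the abstract is not covered by this sketch.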

Similar resources

Geometric formulation of Berezin deformation quantization

In this paper we try to formulate the Berezin quantization on projective Hilbert space P(H) and use its geometric structure to construct a correspondence between a given classical theory and a given quantum theory. It will be shown that the star product in Berezin quantization is equivalent to the Poisson bracket on the coherent states manifold M, embedded in P(H), and the Berezin method is used to...

High-resolution product quantization for Gaussian processes under sup-norm distortion

We derive high-resolution upper bounds for optimal product quantization of pathwise continuous Gaussian processes with respect to the supremum norm on [0, T]. Moreover, we describe a product quantization design which attains this bound. This is achieved under very general assumptions on random series expansions of the process. It turns out that product quantization is asymptotically only slight...

Evaluation of Product Quantization for Image Search

Product quantization is an effective quantization scheme in which a high-dimensional space is decomposed into a Cartesian product of low-dimensional subspaces, and quantization in the different subspaces is conducted separately. We briefly discuss the factors for designing a product quantizer, and then design experiments to comprehensively investigate how these factors influence performance of ima...

Accelerated Distance Computation with Encoding Tree for High Dimensional Data

We propose a novel distance measure for high-dimensional vector pairs that utilizes encodings generated by vector quantization. Methods based on vector quantization are successful in handling large-scale high-dimensional data. These methods compress vectors into short encodings, and allow efficient distance computation between an uncompressed vector and compressed dataset without decom...

Improving Bilayer Product Quantization for Billion-Scale Approximate Nearest Neighbors in High Dimensions

The top-performing systems for billion-scale high-dimensional approximate nearest neighbor (ANN) search are all based on two-layer architectures that include an indexing structure and a compressed datapoints layer. An indexing structure is crucial because it avoids exhaustive search, while the lossy data compression is needed to fit the dataset into RAM. Several of the most successful syste...

Journal:
  • CoRR

Volume: abs/1711.10775

Publication date: 2017